Tree Searching/Rewriting Formalism

Author

  • Petr Nemec

Abstract

We present a formalism for searching and optionally replacing forests of subtrees within labelled trees. The formalism was developed in particular for processing linguistic treebanks. When used as a substitution tool, the interpreter processes rewrite rules, each consisting of a left side and a right side. The left side specifies a forest of subtrees to be searched for within a tree by imposing a set of constraints encoded as a query formula. The right side contains the respective substitutions for these subtrees. In search mode, only the left side is present. The formalism is fully implemented, and the performance of the implemented tool makes it possible to process even large linguistic corpora in acceptable time. The main contributions of the presented work are the expressiveness of the query formula, the elegant and intuitive way the rules are written (and their easy reversibility), and the performance of the implemented tool.
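
The split between a left side (what to find) and a right side (what to put in its place) can be illustrated with a minimal Python sketch. It does not reproduce the formalism's actual rule syntax or query formula, which the abstract does not show: the names `Node`, `Rule`, `search`, and `rewrite` are hypothetical, the left side is reduced to a plain predicate on a single subtree root rather than a constraint formula over a forest of subtrees, and the example labels are invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass
class Node:
    """A labelled tree node (hypothetical representation)."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Rule:
    # Left side: a constraint the root of a subtree must satisfy.
    # Assumption: a simple predicate stands in for the query formula.
    matches: Callable[[Node], bool]
    # Right side: builds the replacement subtree; None means search mode.
    substitute: Optional[Callable[[Node], Node]] = None


def search(tree: Node, rule: Rule) -> Iterator[Node]:
    """Yield every subtree whose root satisfies the left side (pre-order)."""
    if rule.matches(tree):
        yield tree
    for child in tree.children:
        yield from search(child, rule)


def rewrite(tree: Node, rule: Rule) -> Node:
    """Return a new tree in which each matching subtree is replaced.

    A replacement is not searched again, so a rule whose output also
    matches its own left side cannot loop forever.
    """
    if rule.substitute is not None and rule.matches(tree):
        return rule.substitute(tree)
    return Node(tree.label, [rewrite(c, rule) for c in tree.children])


# Invented example: add a determiner to every NP that consists of a bare N.
tree = Node("S", [
    Node("NP", [Node("N")]),
    Node("VP", [Node("V"), Node("NP", [Node("N")])]),
])
rule = Rule(
    matches=lambda n: n.label == "NP" and [c.label for c in n.children] == ["N"],
    substitute=lambda n: Node("NP", [Node("Det"), *n.children]),
)

hits = list(search(tree, rule))   # search mode: only the left side is used
new_tree = rewrite(tree, rule)    # substitution mode: both sides are used
```

Leaving `substitute` as `None` corresponds to the search mode described above, in which only the left side is present.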

Similar resources

Synchronous Context-Free Tree Grammars

We consider pairs of context-free tree grammars combined through synchronous rewriting. The resulting formalism is at least as powerful as synchronous tree adjoining grammars and linear, nondeleting macro tree transducers, while the parsing complexity remains polynomial. Its power is subsumed by context-free hypergraph grammars. The new formalism has an alternative characterization in terms of ...

Rewriting Systems over Unranked Trees

Finite graphs constitute an important tool in various fields of computer science. In order to transfer the theory of finite graphs at least partially to infinite systems, a finite representation of infinite systems is needed. Rewriting systems form a practical model for the finite representation of infinite graphs. Among attractive subclasses of rewriting systems is the class of ground tree rew...

Tree-Rewriting Models of Multi-Word Expressions

Multi-word expressions (MWEs) account for a large portion of the language used in day-to-day interactions. A formal system that is flexible enough to model these large and often syntactically-rich non-compositional chunks as single units in naturally occurring text could considerably simplify large-scale semantic annotation projects, in which it would be undesirable to have to develop internal c...

Transition Graphs of Rewriting Systems over Unranked Trees

We investigate algorithmic properties of infinite transition graphs that are generated by rewriting systems over unranked trees. Two kinds of such rewriting systems are studied. For the first, we construct a reduction to ranked trees via an encoding and to standard ground tree rewriting, thus showing that the generated classes of transition graphs coincide. In the second rewritin...

Large Margin Synchronous Generation and its Application to Sentence Compression

This paper presents a tree-to-tree transduction method for text rewriting. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture structural mismatches. We describe an algorithm for decoding in this framework and show how the model can be trained discriminatively within a large margin framework. E...

Publication date: 2006